ARtPM: Article Retrieval for Precision Medicine

J Biomed Inform. 2019 Jul:95:103224. doi: 10.1016/j.jbi.2019.103224. Epub 2019 Jun 11.

Abstract

Background: Information curation and literature surveillance efforts that synthesize the current knowledge about the impact of genetic variability on disease states and drug responses are vitally important for the practice of evidence-based precision medicine. For these efforts, finding a relevant and comprehensive set of articles in the ever-growing scientific literature is a challenge.

Methods: We have designed and developed Article Retrieval for Precision Medicine (ARtPM), an end-to-end article retrieval system that employs a multi-stage architecture to retrieve and rank relevant articles for a given medical case summary (genetic variants, disease, demographics, and other medical conditions). We compared ARtPM with five baselines, including PubMed Best Match, the improved search functionality recently introduced by PubMed.
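To illustrate the general shape of a multi-stage retrieve-then-rank design, the following is a minimal sketch and not the ARtPM implementation. It assumes a toy in-memory corpus, a hand-written synonym table for query expansion, and a simple term-overlap score standing in for a learned ranking model. All names (`expand_query`, `retrieve`, `rerank`, `corpus`, `synonyms`) and all example documents are hypothetical.

```python
def expand_query(terms, synonyms):
    """Recall-oriented step: add known synonyms to the original query terms."""
    expanded = set(terms)
    for term in terms:
        expanded.update(synonyms.get(term, []))
    return expanded


def retrieve(expanded_terms, corpus):
    """Stage 1: keep every document that shares at least one expanded term."""
    return sorted(
        doc_id
        for doc_id, text in corpus.items()
        if expanded_terms & set(text.lower().split())
    )


def rerank(candidates, terms, corpus):
    """Stage 2: order candidates by how many ORIGINAL query terms they contain.

    This overlap count is a stand-in for a learning-to-rank model.
    """
    def score(doc_id):
        tokens = set(corpus[doc_id].lower().split())
        return sum(1 for term in terms if term in tokens)

    # Higher score first; ties broken by document id for determinism.
    return sorted(candidates, key=lambda doc_id: (-score(doc_id), doc_id))


# Hypothetical toy data, for illustration only.
corpus = {
    "d1": "melanoma skin cancer overview",
    "d2": "braf v600e melanoma treatment",
    "d3": "diabetes insulin therapy",
    "d4": "b-raf inhibitor resistance",
}
synonyms = {"braf": ["b-raf"]}
query_terms = ["braf", "melanoma"]

candidates = retrieve(expand_query(query_terms, synonyms), corpus)
ranked = rerank(candidates, query_terms, corpus)
```

In this sketch the expansion step is what brings in `d4`, which mentions only the synonym, while the second stage moves `d2` ahead of `d1` because it matches both original terms.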

Results: The differences in performance between ARtPM and the five baselines were statistically significant for four metrics that quantify different aspects of search effectiveness (P-values for P@10, R-prec, infNDCG, and Recall@1000 were <.001, <.001, .003, and .009, respectively). Pairwise system comparisons show that ARtPM is comparable to or better than the best-performing baseline on three metrics (R-prec: 0.324 vs 0.299, P-value=.06; infNDCG: 0.556 vs 0.465, P-value=.08; Recall@1000: 0.665 vs 0.572, P-value=.007), but its performance on P@10 (0.603 vs 0.630, P-value=.64) needs to improve.
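For readers less familiar with these metrics, the sketch below shows the standard textbook definitions of three of them (P@k, R-precision, and Recall@k) for a single query. It assumes a ranked list of document identifiers and a set of known relevant identifiers; the function names and example data are hypothetical and are not taken from the ARtPM evaluation code. infNDCG is omitted because it is an inferred measure estimated from sampled relevance judgments.

```python
def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked documents that are relevant."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def r_precision(ranked, relevant):
    """Precision at rank R, where R is the number of relevant documents."""
    r = len(relevant)
    if r == 0:
        return 0.0
    hits = sum(1 for doc_id in ranked[:r] if doc_id in relevant)
    return hits / r


def recall_at_k(ranked, relevant, k=1000):
    """Fraction of all relevant documents found in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


# Hypothetical example: five ranked results, three relevant documents,
# one of which ("f") was never retrieved.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "f"}

p_at_2 = precision_at_k(ranked, relevant, k=2)       # 1 hit in top 2
r_prec = r_precision(ranked, relevant)               # 2 hits in top 3
recall_at_5 = recall_at_k(ranked, relevant, k=5)     # 2 of 3 relevant found
```

P@10 rewards relevant articles at the very top of the list, whereas Recall@1000 rewards coverage deep into the ranking, which is why a system can do well on one and lag on the other.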

Conclusion: The recall-focused phase of ARtPM is effective at retrieving more relevant articles. The precision-focused ranking phase performs well at deeper ranks but needs further work at early ranks (e.g., a richer feature set). Overall, the ARtPM system effectively facilitates evidence-based precision medicine practice and provides a robust search framework for further work in this direction.

Keywords: Biomedical information retrieval; Biomedical knowledge curation; Document retrieval; Learning to rank; Precision medicine; Query expansion.

MeSH terms

  • Biomedical Research
  • Data Curation
  • Databases, Factual
  • Humans
  • Information Storage and Retrieval / methods*
  • Periodicals as Topic
  • Precision Medicine*